Sample and Computationally Efficient Learning Algorithms under S-Concave Distributions

Authors

  • Maria-Florina Balcan
  • Hongyang Zhang
Abstract

We provide new results for noise-tolerant and sample-efficient learning algorithms under s-concave distributions. The new class of s-concave distributions is a broad and natural generalization of log-concavity, and includes many important additional distributions, e.g., the Pareto distribution and t-distribution. This class has been studied in the context of efficient sampling, integration, and optimization, but much remains unknown about the geometry of this class of distributions and their applications in the context of learning. The challenge is that unlike the commonly used distributions in learning (uniform or more generally log-concave distributions), this broader class is not closed under the marginalization operator and many such distributions are fat-tailed. In this work, we introduce new convex geometry tools to study the properties of s-concave distributions and use these properties to provide bounds on quantities of interest to learning, including the probability of disagreement between two halfspaces, disagreement outside a band, and the disagreement coefficient. We use these results to significantly generalize prior results for margin-based active learning, disagreement-based active learning, and passive learning of intersections of halfspaces. Our analysis of geometric properties of s-concave distributions might be of independent interest to optimization more broadly.
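
For orientation, the definition of s-concavity that is standard in this literature (stated here as background; the exact range of s covered by the paper's results is not restated here) says that a density f on R^d is s-concave if, for all points x, y and all \lambda \in [0, 1],

    f((1-\lambda)x + \lambda y) \ge \big((1-\lambda)\, f(x)^{s} + \lambda\, f(y)^{s}\big)^{1/s}.

The limit s \to 0 recovers log-concavity, f((1-\lambda)x + \lambda y) \ge f(x)^{1-\lambda} f(y)^{\lambda}, while negative values of s admit the fat-tailed members of the class such as the Pareto and t-distributions mentioned above.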


Related papers

Active and passive learning of linear separators under log-concave distributions

We provide new results concerning label efficient, polynomial time, passive and active learning of linear separators. We prove that active learning provides an exponential improvement over PAC (passive) learning of homogeneous linear separators under nearly log-concave distributions. Building on this, we provide a computationally efficient PAC algorithm with optimal (up to a constant factor) sa...
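
As a concrete illustration of the margin-based strategy described above, the following Python sketch shows the control flow (the band schedule, the inner fitting step, and all names are illustrative placeholders, not the algorithm or constants from the paper):

    import numpy as np

    def margin_based_active_learner(X, label_oracle, rounds=10, band0=1.0):
        """X: (m, d) array of unit-norm unlabeled points; label_oracle(i) -> +1 or -1."""
        m, d = X.shape
        rng = np.random.default_rng(0)
        w = rng.standard_normal(d)
        w /= np.linalg.norm(w)
        for k in range(rounds):
            band = band0 / 2 ** k                        # shrink the query band each round
            idx = np.flatnonzero(np.abs(X @ w) <= band)  # points close to the current boundary
            if idx.size == 0:
                break
            labels = np.array([label_oracle(i) for i in idx])  # labels requested only inside the band
            # placeholder refit: normalized average of label-weighted in-band points
            w_new = (labels[:, None] * X[idx]).mean(axis=0)
            if np.linalg.norm(w_new) > 0:
                w = w_new / np.linalg.norm(w_new)
        return w

The only point of the sketch is that labels are requested solely for points whose margin falls inside a shrinking band around the current separator, which is where the label-efficiency gains described above come from.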


The Power of Localization for Efficiently Learning Linear Separators with Noise

We introduce a new approach for designing computationally efficient learning algorithms that are tolerant to noise, and demonstrate its effectiveness by designing algorithms with improved noise tolerance guarantees for learning linear separators. We consider both the malicious noise model of Valiant [Valiant 1985; Kearns and Li 1988] and the adversarial label noise model of Kearns, Schapire, an...


Fourier-Based Testing for Families of Distributions

We study the general problem of testing whether an unknown discrete distribution belongs to a given family of distributions. More specifically, given a class of distributions 𝒫 and sample access to an unknown distribution P, we want to distinguish (with high probability) between the case that P ∈ 𝒫 and the case that P is ε-far, in total variation distance, from every distribution in 𝒫. This is...
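
For readers unfamiliar with the distance used above, a small Python helper makes the "ε-far in total variation" condition concrete (background arithmetic only, not the Fourier-based tester itself; representing the family as an explicit list of probability vectors is an illustrative simplification):

    import numpy as np

    def total_variation(p, q):
        """TV distance between two probability vectors over the same finite domain."""
        return 0.5 * float(np.abs(np.asarray(p, float) - np.asarray(q, float)).sum())

    def is_eps_far(p, family, eps):
        """True if p is eps-far in TV distance from every distribution in `family`."""
        return all(total_variation(p, q) >= eps for q in family)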


Efficient Robust Proper Learning of Log-concave Distributions

We study the robust proper learning of univariate log-concave distributions (over continuous and discrete domains). Given a set of samples drawn from an unknown target distribution, we want to compute a log-concave hypothesis distribution that is as close as possible to the target, in total variation distance. In this work, we give the first computationally efficient algorithm for this learning...
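
As background on the hypothesis class mentioned above (a standard characterization stated from general knowledge, not code from the paper): a distribution over a finite domain is log-concave when its support is a contiguous interval and p_i^2 >= p_{i-1} * p_{i+1} holds at every interior point of the support. A small Python check:

    import numpy as np

    def is_log_concave(p, tol=1e-12):
        """True if the probability vector p is log-concave on a contiguous support."""
        p = np.asarray(p, dtype=float)
        support = np.flatnonzero(p > tol)
        if support.size == 0:
            return False
        lo, hi = support[0], support[-1]
        if np.any(p[lo:hi + 1] <= tol):       # support must be an interval
            return False
        if hi - lo < 2:                       # one or two support points: trivially log-concave
            return True
        inner = p[lo + 1:hi]                  # interior points of the support
        return bool(np.all(inner ** 2 >= p[lo:hi - 1] * p[lo + 2:hi + 1] - tol))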


Learning mixtures of structured distributions over discrete domains

Let C be a class of probability distributions over the discrete domain [n] = {1, . . . , n}. We show that if C satisfies a rather general condition – essentially, that each distribution in C can be well-approximated by a variable-width histogram with few bins – then there is a highly efficient (both in terms of running time and sample complexity) algorithm that can learn any mixture of k unknow...
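
To make the "variable-width histogram with few bins" condition concrete, here is a toy Python sketch that flattens a distribution over [n] onto k intervals; the equal-mass splitting rule is an illustrative choice, not the construction from the paper:

    import numpy as np

    def variable_width_histogram(p, k):
        """p: probability vector over {0,...,n-1}; returns (bin edges, flattened approximation)."""
        p = np.asarray(p, dtype=float)
        cdf = np.cumsum(p)
        # place k-1 interior breakpoints at (roughly) equal probability mass
        targets = np.linspace(0.0, 1.0, k + 1)[1:-1]
        cuts = sorted({min(int(c) + 1, len(p)) for c in np.searchsorted(cdf, targets)})
        edges = [0] + cuts + [len(p)]
        q = np.zeros_like(p)
        for lo, hi in zip(edges[:-1], edges[1:]):
            if hi > lo:
                q[lo:hi] = p[lo:hi].sum() / (hi - lo)   # one constant value per bin
        return edges, q

The abstract's condition is that every distribution in C admits such an approximation with a small number of bins; the learning algorithm exploits that structure rather than the specific splitting rule used here.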




Publication year: 2017